Cascade Markov random fields for stroke extraction of Chinese characters

Authors

Jia Zeng

Wei Feng

Lei Xie

Zhi-Qiang Liu

Abstract

Extracting perceptually meaningful strokes plays an essential role in modeling the structures of handwritten Chinese characters for accurate character recognition. This paper proposes a cascade Markov random field (MRF) model that combines both bottom-up (BU) and top-down (TD) processes for stroke extraction. In the low-level stroke segmentation process, we use a BU MRF model with a smoothness prior to segment the character skeleton into directional substrokes based on self-organization of pixel-based directional features. In the high-level stroke extraction process, the segmented substrokes are sent to a TD MRF-based character model that, in turn, feeds back to guide the merging of corresponding substrokes to produce reliable candidate strokes for character recognition. The merit of the cascade MRF model lies in its ability to encode the local statistical dependencies of neighboring stroke components as well as prior knowledge of Chinese character structures. Encouraging stroke extraction and character recognition results confirm the effectiveness of our method, which integrates both BU and TD vision processing streams within a unified MRF framework.
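As a rough illustration of the low-level segmentation step only, the sketch below labels each skeleton pixel with a quantized direction by minimizing an MRF energy made of a per-pixel directional cost plus a smoothness term, then groups connected same-label pixels into substrokes. It is not the authors' model: the four-direction label set, the `unary_cost` definition, the Potts-style weight `beta`, and the use of iterated conditional modes (ICM) for inference are all assumptions chosen for brevity, and every function name is hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)

# Assumed label set: horizontal, vertical, and the two diagonals.
DIRECTIONS: Dict[str, Pixel] = {"h": (0, 1), "v": (1, 0), "d": (1, 1), "a": (1, -1)}


def neighbors(p: Pixel, skeleton: Iterable[Pixel]) -> List[Pixel]:
    """8-connected neighbors of p that belong to the skeleton."""
    r, c = p
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and (r + dr, c + dc) in skeleton]


def unary_cost(p: Pixel, label: str, skeleton: Set[Pixel]) -> int:
    """Assumed data term: 0, 1 or 2, lower when the skeleton continues along `label`."""
    dr, dc = DIRECTIONS[label]
    r, c = p
    support = ((r + dr, c + dc) in skeleton) + ((r - dr, c - dc) in skeleton)
    return 2 - support


def local_energy(p: Pixel, label: str, labels: Dict[Pixel, str],
                 skeleton: Set[Pixel], beta: float) -> float:
    """Data term plus Potts smoothness: beta per neighbor with a different label."""
    disagree = sum(labels[q] != label for q in neighbors(p, skeleton))
    return unary_cost(p, label, skeleton) + beta * disagree


def segment(skeleton: Set[Pixel], beta: float = 0.5, sweeps: int = 10) -> Dict[Pixel, str]:
    """Label skeleton pixels by ICM: greedy per-pixel energy minimization."""
    order = sorted(skeleton)
    # Initialize from the data term alone.
    labels = {p: min(DIRECTIONS, key=lambda l: unary_cost(p, l, skeleton)) for p in order}
    for _ in range(sweeps):
        changed = False
        for p in order:
            best = min(DIRECTIONS, key=lambda l: local_energy(p, l, labels, skeleton, beta))
            if (local_energy(p, best, labels, skeleton, beta)
                    < local_energy(p, labels[p], labels, skeleton, beta)):
                labels[p] = best
                changed = True
        if not changed:  # converged to a local minimum
            break
    return labels


def substrokes(labels: Dict[Pixel, str]) -> List[Tuple[str, List[Pixel]]]:
    """Group 8-connected pixels that share a label into directional substrokes."""
    seen: Set[Pixel] = set()
    groups = []
    for start in sorted(labels):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            p = stack.pop()
            group.append(p)
            for q in neighbors(p, labels):
                if q not in seen and labels[q] == labels[start]:
                    seen.add(q)
                    stack.append(q)
        groups.append((labels[start], sorted(group)))
    return groups


# Toy skeleton: a horizontal and a vertical stroke crossing at (3, 3).
cross = {(3, c) for c in range(7)} | {(r, 3) for r in range(7)}
for direction, pixels in substrokes(segment(cross)):
    print(direction, pixels)
```

On this toy cross the junction pixel can carry only one label, so one of the two strokes comes out as two separate substrokes. That kind of fragmentation is what a high-level, top-down merging stage such as the one described in the abstract would need to resolve.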

Similar resources

Hidden Markov Random Field Based Approach for Off-Line Handwritten Chinese Character Recognition

This paper presents a Hidden Markov Mesh Random Field (HMMRF) based approach for off-line handwritten Chinese character recognition using statistical observation sequences embedded in the strokes of a character. Because of the large Chinese character set and the many different writing styles, the recognition of handwritten Chinese characters is very challenging. In our approach, the binary image is ...

Transliteration Extraction from Classical Chinese Buddhist Literature Using Conditional Random Fields

Extracting plausible transliterations from historical literature is a key issue in historical linguistics and other research fields. In Chinese historical literature, the characters used to transliterate the same loanword may vary because of different translation eras or different Chinese language preferences among translators. To assist historical linguistics and digital humanities researchers, t...

Combination of Machine Learning Methods for Optimum Chinese Word Segmentation

This article presents our recent work for participation in the Second International Chinese Word Segmentation Bakeoff. Our system performs two procedures: out-of-vocabulary extraction and word segmentation. We compose three out-of-vocabulary extraction modules: character-based tagging with different classifiers – maximum entropy, support vector machines, and conditional random fields. We also co...

A Run-Length Coding Based Approach to Stroke Extraction of Chinese Characters

Traditional stroke extraction approaches usually adopt a thinning technique as the preprocessing method for obtaining the skeletons of Chinese characters. However, thinning may produce spurious branches and multiple fork points at junctions. Such distortion makes the stroke extraction process more complicated and unreliable. This paper proposes a novel run-length-based stroke extraction approach wit...

A Model of Stroke Extraction from Chinese Character Images

Given the large number and complexity of Chinese characters, pattern matching based on structural decomposition and analysis is believed to be necessary and essential to off-line character recognition. This paper proposes a new model of stroke extraction for Chinese characters. One problem for stroke extraction is how to extract primary strokes. Another major problem is to solve the segmentatio...

Journal:

Inf. Sci.

Volume 180

Publication date: 2010
